Concatenate with OME-Zarr v0.5 and sharding #104
base: main
Conversation
As of 83cd243, converting a 282 GB dataset (325 GB decompressed) took 7 minutes on 2 nodes with 64 CPUs each. For reference, converting a 65 GB dataset (183 GB decompressed) takes 2 minutes on 16 CPUs when using thread parallelism.
Ran into zarr-developers/zarr-python#3221.
pyproject.toml (Outdated)

    "scikit-learn",
]

[project.optional-dependencies]
Is this any different than the one from PyPI?
Good question - I'm guessing no? @tayllatheodoro may know better
-    sbatch_filepath: str = None,
+    sbatch_filepath: str | None = None,
     local: bool = False,
     block: bool = False,
What was the motivation for including the block parameter? Was it useful during testing?
When running locally there isn't a good way to check whether the jobs (processes) have finished, so `block` lets the caller wait for them. It is also useful for testing.
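For context, here is a minimal sketch of how a `block` flag like this can be handled for local runs. This is a hypothetical illustration only; the `submit_jobs` name and the use of `multiprocessing` are assumptions, not biahub's actual implementation.

```python
# Hypothetical sketch: how a `block` flag could wait on local jobs.
# Names and structure are illustrative, not biahub's implementation.
import multiprocessing


def submit_jobs(
    tasks,
    sbatch_filepath: str | None = None,
    local: bool = False,
    block: bool = False,
):
    if local:
        procs = [multiprocessing.Process(target=task) for task in tasks]
        for p in procs:
            p.start()
        if block:
            # Without blocking, local processes run detached and there is
            # no easy way to tell when they have finished.
            for p in procs:
                p.join()
        return procs
    # On a cluster, jobs would instead be submitted via sbatch
    # (optionally using a template script at `sbatch_filepath`).
    raise NotImplementedError("cluster submission omitted in this sketch")
```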
The current plan is that this PR will be merged after czbiohub-sf/iohub#301, updating the iohub dependency to the main branch.
LGTM, I was able to run `biahub concatenate -c rechunk.yml -o test.zarr -sb sbatch.sh` on a dataset which hasn't been converted from OME-NGFF v0.4/Zarr V2 to OME-NGFF v0.5/Zarr V3, over here: /hpc/projects/intracellular_dashboard/organelle_dynamics/rerun/2025_04_15_A549_H2B_CAAX_ZIKV_DENV/2-assemble/zarr-v3.
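A quick way to sanity-check the result of such a run — a sketch, assuming zarr-python ≥ 3 and that the arrays sit at the top level of `test.zarr` (OME-Zarr layouts often nest them deeper):

```python
# Sketch: confirm the output is Zarr V3 and inspect chunk/shard shapes.
# The store path and flat layout are assumptions; adjust for nested groups.
import zarr

group = zarr.open_group("test.zarr", mode="r")
print(group.metadata.zarr_format)  # expect 3 for OME-Zarr v0.5

for name, arr in group.arrays():
    # `shards` is None when the sharding codec is not in use.
    print(name, arr.chunks, arr.shards)
```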
Blocked until we bump waveorder.
Another blocker is
Needs czbiohub-sf/iohub#311
To be investigated: multiprocessing-based parallelism is not compatible with the asyncio-based thread parallelism that zarr-python is designed around, and it appears to be a bit slower.
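For a rough illustration of the thread-based pattern that zarr-python's async internals favor, here is a sketch. The shapes, chunk/shard sizes, and output path are made up, and it assumes zarr-python ≥ 3, where `zarr.create_array` accepts a `shards=` argument:

```python
# Sketch: writing time points of a sharded Zarr V3 array from a thread pool.
# Shapes, chunking, and the output path are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import zarr

arr = zarr.create_array(
    store="scratch_v3.zarr",
    shape=(64, 2, 32, 512, 512),   # (T, C, Z, Y, X)
    chunks=(1, 1, 32, 256, 256),
    shards=(1, 2, 32, 512, 512),   # one shard per time point
    dtype="uint16",
)


def write_timepoint(t: int) -> None:
    # Thread workers share the open array and zarr's asyncio-based I/O;
    # separate processes would each need to reopen the store and cannot
    # share the event loop.
    arr[t] = np.random.randint(0, 2**16, size=arr.shape[1:], dtype="uint16")


with ThreadPoolExecutor(max_workers=16) as pool:
    list(pool.map(write_timepoint, range(arr.shape[0])))
```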